Learning Speech Emotion Representations in the Quaternion Domain
The modeling of human emotion expression in speech signals is an important,
yet challenging task. The high resource demand of speech emotion recognition
models, combined with the general scarcity of emotion-labelled data, are
obstacles to the development and application of effective solutions in this
field. In this paper, we present an approach to jointly circumvent these
difficulties. Our method, named RH-emo, is a novel semi-supervised architecture
aimed at extracting quaternion embeddings from real-valued monaural
spectrograms, enabling the use of quaternion-valued networks for speech emotion
recognition tasks. RH-emo is a hybrid real/quaternion autoencoder network that
consists of a real-valued encoder in parallel to a real-valued emotion
classifier and a quaternion-valued decoder. On the one hand, the classifier
makes it possible to optimize each latent axis of the embeddings for the classification
of a specific emotion-related characteristic: valence, arousal, dominance and
overall emotion. On the other hand, the quaternion reconstruction enables the
latent dimension to develop intra-channel correlations that are required for an
effective representation as a quaternion entity. We test our approach on speech
emotion recognition tasks using four popular datasets: Iemocap, Ravdess, EmoDb
and Tess, comparing the performance of three well-established real-valued CNN
architectures (AlexNet, ResNet-50, VGG) and their quaternion-valued equivalents
fed with the embeddings created with RH-emo. We obtain a consistent improvement
in the test accuracy for all datasets, while drastically reducing the
resource demand of the models. Moreover, we performed additional experiments and
ablation studies that confirm the effectiveness of our approach. The RH-emo
repository is available at: https://github.com/ispamm/rhemo.
Comment: Paper submitted to IEEE/ACM Transactions on Audio, Speech and
Language Processing
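As a rough illustration of why a four-axis embedding can be treated as a single quaternion entity, the sketch below implements the Hamilton product, the operation on which quaternion-valued layers are generally built. It is not taken from the RH-emo repository: the function name, the mapping of the four latent axes onto the quaternion components, and the numeric values are all assumptions made for the example.

```python
def hamilton_product(p, q):
    """Hamilton product of two quaternions given as (r, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


# Hypothetical embedding value: one number per latent axis. The ordering
# (overall emotion, valence, arousal, dominance) is an assumption.
embedding = (0.2, -0.5, 0.7, 0.1)

# Hypothetical quaternion weight. A single weight mixes all four axes,
# so every output component depends on every input component.
weight = (0.9, 0.1, -0.3, 0.4)

mixed = hamilton_product(weight, embedding)
print(mixed)
```

Because each output component is a signed combination of all four input components, a quaternion weight shares its four parameters across the axes, where a real-valued 4x4 mixing matrix would need sixteen.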
L3DAS21 Challenge: Machine Learning for 3D Audio Signal Processing
The L3DAS21 Challenge is aimed at encouraging and fostering collaborative
research on machine learning for 3D audio signal processing, with particular
focus on 3D speech enhancement (SE) and 3D sound localization and detection
(SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour
3D audio corpus, accompanied by a Python API that facilitates data
usage and the results submission stage. Usually, machine learning approaches to 3D
audio tasks are based on single-perspective Ambisonics recordings or on arrays
of single-capsule microphones. We propose, instead, a novel multichannel audio
configuration based on multiple-source and multiple-perspective Ambisonics
recordings, performed with an array of two first-order Ambisonics microphones.
To the best of our knowledge, this is the first time that a dual-mic Ambisonics
configuration is used for these tasks. We provide baseline models and results
for both tasks, obtained with state-of-the-art architectures: FaSNet for SE and
SELDNet for SELD. This report is aimed at providing all the information needed to
participate in the L3DAS21 Challenge, illustrating the details of the L3DAS21
dataset, the challenge tasks and the baseline models.
Comment: Documentation paper for the L3DAS21 Challenge for IEEE MLSP 2021.
Further information on www.l3das.com/mlsp202
Euclid Near Infrared Spectrometer and Photometer instrument concept and first test results obtained for different breadboards models at the end of phase C
The Euclid mission objective is to understand why the expansion of the Universe is accelerating by mapping the geometry of the dark Universe, investigating the distance-redshift relationship and tracing the evolution of cosmic structures. The Euclid project is part of ESA's Cosmic Vision program with its launch planned for 2020 (ref [1]). The NISP (Near Infrared Spectrometer and Photometer) is one of the two Euclid instruments and operates in the near-IR spectral region (900-2000 nm) as a photometer and spectrometer. The instrument is composed of:
- a cold (135 K) optomechanical subsystem consisting of a silicon carbide structure, an optical assembly (corrector and camera lens), a filter wheel mechanism, a grism wheel mechanism, a calibration unit and a thermal control system;
- a detection subsystem based on a mosaic of 16 HAWAII2RG detectors cooled to 95 K, with their front-end readout electronics cooled to 140 K, integrated on a mechanical focal plane structure made of molybdenum and aluminum. The detection subsystem is mounted on the optomechanical subsystem structure;
- a warm electronic subsystem (280 K) composed of a data processing / detector control unit and of an instrument control unit that interfaces with the spacecraft via a 1553 bus for command and control and via Spacewire links for science data.
This presentation describes the architecture of the instrument at the end of phase C (Detailed Design Review), the expected performance, the key technological challenges and preliminary test results obtained for different NISP subsystem breadboards and for the NISP Structural and Thermal Model (STM).
Psychometric Properties and Correlates of Precarious Manhood Beliefs in 62 Nations
Precarious manhood beliefs portray manhood, relative to womanhood, as a social status that is hard to earn, easy to lose, and proven via public action. Here, we present cross-cultural data on a brief measure of precarious manhood beliefs (the Precarious Manhood Beliefs scale [PMB]) that covaries meaningfully with other cross-culturally validated gender ideologies and with country-level indices of gender equality and human development. Using data from university samples in 62 countries across 13 world regions (N = 33,417), we demonstrate: (1) the psychometric isomorphism of the PMB (i.e., its comparability in meaning and statistical properties across the individual and country levels); (2) the PMB’s distinctness from, and associations with, ambivalent sexism and ambivalence toward men; and (3) associations of the PMB with nation-level gender equality and human development. Findings are discussed in terms of their statistical and theoretical implications for understanding widely held beliefs about the precariousness of the male gender role.
L3DAS21 challenge: machine learning for 3D audio signal processing
The L3DAS21 Challenge (www.l3das.com/mlsp2021) is aimed at encouraging and fostering collaborative research on machine learning for 3D audio signal processing, with particular focus on 3D speech enhancement (SE) and 3D sound localization and detection (SELD). Alongside the challenge, we release the L3DAS21 dataset, a 65-hour 3D audio corpus, accompanied by a Python API that facilitates data usage and the results submission stage. Usually, machine learning approaches to 3D audio tasks are based on single-perspective Ambisonics recordings or on arrays of single-capsule microphones. We propose, instead, a novel multichannel audio configuration based on multiple-source and multiple-perspective Ambisonics recordings, performed with an array of two first-order Ambisonics microphones. To the best of our knowledge, this is the first time that a dual-mic Ambisonics configuration is used for these tasks. We provide baseline models and results for both tasks, obtained with state-of-the-art architectures: FaSNet for SE and SELDnet for SELD.
Polymer memories: Bistable electrical switching and device performance
Polymer 48 (18), 5182-5201 (POLM)
Gendered Self-Views Across 62 Countries: A Test of Competing Models
Social role theory posits that binary gender gaps in agency and communion should be larger in less egalitarian countries, reflecting these countries’ more pronounced sex-based power divisions. Conversely, evolutionary and self-construal theorists suggest that gender gaps in agency and communion should be larger in more egalitarian countries, reflecting the greater autonomy support and flexible self-construction processes present in these countries. Using data from 62 countries (N = 28,640), we examine binary gender gaps in agentic and communal self-views as a function of country-level objective gender equality (the Global Gender Gap Index) and subjective distributions of social power (the Power Distance Index). Findings show that in more egalitarian countries, gender gaps in agency are smaller and gender gaps in communality are larger. These patterns are driven primarily by cross-country differences in men’s self-views and by the Power Distance Index (PDI) more robustly than the Global Gender Gap Index (GGGI). We consider possible causes and implications of these findings.